

The  ideas  expressed  in  this  paper  are  those  of  the 
author.  The  paper  does  not  necessarily  represent 
the  views  of  the  Center  for  Naval  Analyses. 




CENTER  FOR  NAVAL  ANALYSES 


1401 Wilson Boulevard

Arlington,  Virginia  22209 



DISTRIBUTION  STATEMENT  A 

Approved  for  public  release; 

Distribution  Unlimited 



PROFESSIONAL  PAPER  NO.  217 
MARCH  1978 


BIBLIOMETRIC  STUDIES  OF  SCIENTIFIC  PRODUCTIVITY 


Russell  C.  Coile 

Center for Naval Analyses        Centre for Information Science
University of Rochester          The City University
Arlington, Virginia              London, England


(Paper  presented  at  the  annual  meeting  of  the 
American  Society  for  Information  Science  held 
in  San  Francisco,  California,  October  1976.) 


ABSTRACT 


In 1926, Alfred J. Lotka examined the
scientific publishing productivity of chemists.
His bibliometric study of the number of chemists
listed in Chemical Abstracts who had
published one, two, three, etc. papers in a
ten-year period was the first of many such
studies. Lotka proposed an "inverse square
law" of scientific productivity in which the
frequency of authors publishing x papers
varied inversely as the square of x.
Bibliometric research is underway to explore the
applicability of other frequency distributions,
including Fisher's logarithmic series, Yule's
Beta function and the Weibull distribution.

BACKGROUND

One  of  the  interesting  areas  in  information  science  is  the 
question  of  scientific  productivity.  How  many  papers  might  a 
chemist,  physicist,  mathematician,  biologist,  etc.  write  during 
his  or  her  professional  lifetime?  This  question  is  pertinent  to 
the  editors  of  technical  journals,  the  managers  of  abstract  services, 
the librarians who work with journals, abstracts and other
information retrieval systems, and to fellow chemists and physicists
in  their  research  laboratories. 


PRODUCTIVITY  OF  GREAT  SCIENTISTS 


Various  studies  of  productivity  have  been  conducted  over 
the  years.  Initially,  it  was  fashionable  to  count  the  number  of 
papers written by great men. Table 1 shows some of the
achievements of eminent scientists of the 19th century, as given by
Dennis,  reference  1.  Table  2 presents  the  number  of  papers 
listed  in  the  Biographical  Memoirs  (1943-1952)  of  members  of  the 
National  Academy  of  Sciences,  also  as  given  by  Dennis.  For 

TABLE 1

EMINENT SCIENTISTS OF 19th CENTURY
(30 Years Assumed for Publishing Period)

Name           Papers    Papers/Year

Liebig            307        10
Bertholet         236         8
Pasteur           172         6
Faraday           161         5
Poisson           158         5
Agassiz           153         5
Herschel          151         5
Humboldt          142         5
Gay-Lussac        134         4
Gauss             123         4
Kelvin            114         4
Maxwell            90         3
Joule              89         3
Davy               86         3
Helmholtz          86         3
Lyell              76         3
Hamilton           71         2
Darwin             61         2
Riemann*           19         1

*Died at age 40.


TABLE 2

NATIONAL ACADEMY OF SCIENCES
BIOGRAPHICAL MEMOIRS 1943-1952
(40 Years Assumed)

Name         Field           Papers    Papers/Year

Levene       Biochemistry       768        19
Merriam      Zoology            626        16
Stejneger    Zoology            499        12
Davis        Geology            477        12
Barus        Physics            420        11
Kennelly     Engineering        362         9
Johnson      Chemistry          358         9
Hrdlicka     Anthropology       340         9
Campbell     Astronomy          330         8
Cushing      Medicine           306         8
Davenport    Genetics           303         8



example, Professor Kennelly of Harvard published 362 papers, for an
average of about 9 papers per year. The assumption usually made
in these studies was that some papers of high quality must have been
associated with this high quantity of output.

PRODUCTIVITY  OF  MR.  AVERAGE  SCIENTIST 

Bibliometric studies of Mr. Average Scientist have used
such sources as Chemical Abstracts for a data base, or lists of
publications written over a 25-year period by mathematicians, etc.

Table 3 shows some data for publications of chemists,
mathematicians, and biologists. Mr. Average Scientist seems to have published
approximately 0.1 paper per year (the fields in Table 3 range from
0.07 to 0.33). Moreover, these data cover only the
fraction of scientists who publish at least one paper during their
lifetime.

LOTKA'S  LAW 

One of the pioneers in bibliometric studies of scientific
productivity was Alfred Lotka, reference 2, who published a classic
paper in 1926 in the Journal of the Washington Academy of Sciences.

His paper on frequency distributions of scientific productivity
presented an analysis of the number of chemists who were listed in
Chemical Abstracts, 1907-1916, as having published one, two, three,
etc. papers. Table 4 shows his data for the senior author chemists
whose names began with the letters A and B. Figure 1 illustrates
the percent of authors with various numbers of abstracts. Figure 2
is Lotka's log-log plot of these data, showing that, of the chemists who
published, approximately 58 percent (3,991 of 6,891) had published only
one paper. The slope of the line was about -1.9.

TABLE 3

BIBLIOMETRIC DATA

                                                   Percent      Papers/
                                                   One          Author/
Field                  Authors   Papers   Years    Paper        Year

Chemists                 6,891   22,939     10       58         0.33
Mathematicians             278    1,124     25       48         0.16
Fluidics                   401      529      9       69         0.15
Drosophilists              826    3,662     33       51         0.13
Econometricians            721    1,759     20       60         0.12
Operations Research        783    1,158     15       76         0.10
ACM                        420      383     10       83         0.09
Biologists                 130      264     28       59         0.07
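
The last column of Table 3 can be checked from the other columns. The following is a minimal sketch in Python, assuming that "Papers/Author/Year" means total papers divided by the number of authors and by the number of years covered; the names `TABLE_3` and `papers_per_author_year` are illustrative and do not come from the paper.

```python
# Table 3 rows: (field, authors, papers, years covered)
TABLE_3 = [
    ("Chemists", 6891, 22939, 10),
    ("Mathematicians", 278, 1124, 25),
    ("Fluidics", 401, 529, 9),
    ("Drosophilists", 826, 3662, 33),
    ("Econometricians", 721, 1759, 20),
    ("Operations Research", 783, 1158, 15),
    ("ACM", 420, 383, 10),
    ("Biologists", 130, 264, 28),
]


def papers_per_author_year(authors, papers, years):
    """Mean papers per publishing author per year of the survey period."""
    return papers / authors / years


rates = {field: papers_per_author_year(a, p, y) for field, a, p, y in TABLE_3}

for field, rate in rates.items():
    print(f"{field:<20} {rate:.2f}")
```

Under this assumption the computed rates round to the values printed in the table for all eight fields.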

TABLE 4

LOTKA'S FREQUENCY DISTRIBUTION
OF CHEMICAL ABSTRACTS 1907-1916

A & B

Abstracts    Chemists

    1          3,991
    2          1,059
    3            493
    4            287
    5            184
    6            131
  ...            ...
  346              1

Total          6,891

FIG. 1: PERCENT AUTHORS WITH A GIVEN NUMBER OF ABSTRACTS


FIG.  2:  PERCENT  AUTHORS  VERSUS  ABSTRACTS 
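
The share of one-paper authors and the slope of the log-log plot can be approximated from the rows of Table 4. The sketch below is only illustrative: it fits an ordinary least-squares line to the six rows reproduced in Table 4 (one to six abstracts), which is an assumption of this example, so it need not agree exactly with the line Lotka drew through his full data. The names `LOTKA_A_B` and `loglog_slope` are hypothetical.

```python
import math

# First six rows of Table 4: abstracts per author -> number of chemists
LOTKA_A_B = {1: 3991, 2: 1059, 3: 493, 4: 287, 5: 184, 6: 131}
TOTAL_AUTHORS = 6891  # Table 4 total


def loglog_slope(counts):
    """Ordinary least-squares slope of log10(count) against log10(x)."""
    xs = [math.log10(x) for x in counts]
    ys = [math.log10(y) for y in counts.values()]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


percent_one_paper = 100 * LOTKA_A_B[1] / TOTAL_AUTHORS
slope = loglog_slope(LOTKA_A_B)
print(f"one-paper authors: {percent_one_paper:.1f} percent")
print(f"log-log slope: {slope:.2f}")
```

Because the fractions differ from the raw counts only by a constant factor, fitting counts or percentages gives the same slope.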


Lotka  then  examined  some  data  for  publications  of  physicists 
and  found  a slope  of  about  -2.0.  He  therefore  proposed  an  inverse 
square  law  of  scientific  productivity  for  chemists  and  physicists. 


Lotka's Inverse Square Law is:

$$ y = \frac{C}{x^{2}} \tag{1} $$

Where y is Frequency of Authors,
x is Number of Papers,
C is a constant.
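
As an illustration of the inverse square law, the sketch below computes the fraction of authors expected at each number of papers. It assumes that y is expressed as a fraction of all publishing authors and that x runs over all positive integers, in which case the fractions sum to 1 only if the constant equals 6/pi^2 (about 0.61); that normalization and the function name `inverse_square_fraction` are assumptions of this example, not statements from the paper.

```python
import math


def inverse_square_fraction(x):
    """Fraction of authors with exactly x papers under y = C / x**2.

    Assumes C = 6 / pi**2 so the fractions sum to 1 over x = 1, 2, 3, ...
    """
    c = 6 / math.pi ** 2
    return c / x ** 2


TOTAL_AUTHORS = 6891  # total of Table 4

predicted = {x: TOTAL_AUTHORS * inverse_square_fraction(x) for x in range(1, 7)}
for x, n in predicted.items():
    print(f"{x} papers: about {n:.0f} authors")
```

The predicted counts can be set beside the observed counts in Table 4 to judge how closely an exponent of exactly 2 describes the chemists' data.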

FREQUENCY  DISTRIBUTIONS 

There have been a number of different theoretical approaches
to this question of frequency distributions. Some workers were
not aware of Lotka's law, while others have attempted to find a
better formulation. For example, Williams, in studying the
publications of biologists, examined a geometric series, equation (2).
Williams' Geometric Series is:

$$ n_1,\ n_1 x,\ n_1 x^{2},\ \text{etc.} \tag{2} $$

Where $n_1$ is number of authors of one paper,

$$ n_1 = \frac{S^{2}}{N} $$

where S is total authors,

$$ x = \frac{N-S}{N} $$

where N is total papers.
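
The geometric series is fixed entirely by the two totals S and N. The sketch below, offered only as an illustration, evaluates it for the Drosophilists row of Table 3 (S = 826 authors, N = 3,662 papers), using n1 = S^2/N and x = (N - S)/N; the function name `williams_geometric` and the truncation at 400 terms are choices of this example.

```python
def williams_geometric(total_authors, total_papers, max_papers):
    """Expected authors with 1..max_papers papers under the geometric
    series n1, n1*x, n1*x**2, ... with n1 = S**2 / N and x = (N - S) / N."""
    s, n = total_authors, total_papers
    n1 = s * s / n
    x = (n - s) / n
    return [n1 * x ** (k - 1) for k in range(1, max_papers + 1)]


# Drosophilists row of Table 3: S = 826 authors, N = 3,662 papers
series = williams_geometric(826, 3662, 400)

authors_summed = sum(series)
papers_summed = sum(k * a for k, a in enumerate(series, start=1))
print(f"n1 = {series[0]:.1f}")
print(f"authors summed = {authors_summed:.1f}")
print(f"papers summed  = {papers_summed:.1f}")
```

Summing the series recovers S, and summing k times the k-th term recovers N, which is a quick consistency check on the two parameter formulas.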


He also explored the possible use of Fisher's logarithmic series,
equation (3), and found a good fit for biologists.

Fisher's Logarithmic Series is:

$$ n_1,\ \frac{n_1 x}{2},\ \frac{n_1 x^{2}}{3},\ \frac{n_1 x^{3}}{4},\ \text{etc.} \tag{3} $$

Where $n_1$ is number of authors of one paper,

$$ S = \frac{n_1}{x}\left[-\log_e(1-x)\right] $$

where S is total authors,

$$ N = \frac{n_1}{1-x} $$

where N is total papers.
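
Unlike the geometric series, the logarithmic series has no closed-form solution for x in terms of S and N. Dividing the two relations above gives N/S = x / ((1 - x)(-log_e(1 - x))), which can be solved numerically. The sketch below uses simple bisection, a choice made for this example rather than a method described in the paper, and again takes the Drosophilists totals from Table 3; `fisher_x` and `fisher_series` are hypothetical names.

```python
import math


def fisher_x(total_authors, total_papers, tol=1e-12):
    """Solve N/S = x / ((1 - x) * -ln(1 - x)) for x in (0, 1) by bisection.

    Requires total_papers > total_authors.
    """
    target = total_papers / total_authors
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        ratio = mid / ((1 - mid) * -math.log(1 - mid))
        if ratio < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def fisher_series(total_authors, total_papers, max_papers):
    """Expected authors with 1..max_papers papers: n1 * x**(k - 1) / k."""
    x = fisher_x(total_authors, total_papers)
    n1 = total_papers * (1 - x)  # from N = n1 / (1 - x)
    return [n1 * x ** (k - 1) / k for k in range(1, max_papers + 1)]


# Drosophilists row of Table 3: S = 826 authors, N = 3,662 papers
series = fisher_series(826, 3662, 2000)
authors_summed = sum(series)
papers_summed = sum(k * a for k, a in enumerate(series, start=1))
print(f"x = {fisher_x(826, 3662):.4f}, n1 = {series[0]:.1f}")
```

The bisection relies on the ratio increasing steadily from 1 as x goes from 0 toward 1, so a solution exists whenever there are more papers than authors.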


Simon,  reference  4,  proposed  a Beta-function  model  which 
he  called  the  "Yule"  distribution,  equation  (4).  Simon's  Yule 
distribution  is: 

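$$ f(i) = A\,B(i, \rho+1) \tag{4} $$

Where f(i) is number of authors of i papers,
A is a constant,
$\rho$ is $1/(1-\alpha)$, where $\alpha = S/N$,
S is total authors,
N is total papers,
$B(i, \rho+1)$ is Beta-function of i, $\rho+1$:

$$ B(i, \rho+1) = \int_0^1 \lambda^{i-1}(1-\lambda)^{\rho}\,d\lambda = \frac{\Gamma(i)\,\Gamma(\rho+1)}{\Gamma(i+\rho+1)} \qquad (0 < i,\ 0 < \rho < \infty) $$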
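
The Yule distribution can be evaluated with the Gamma-function form of the Beta function. The sketch below is a hedged illustration: the paper says only that A is a constant, and this example assumes A = S * rho, the value for which f(i) summed over all i equals S. Logarithms of Gamma functions (`math.lgamma`) are used to avoid overflow for large i. The function name `yule_counts` and the Drosophilists totals from Table 3 are choices of this example.

```python
import math


def yule_counts(total_authors, total_papers, max_papers):
    """Expected authors with 1..max_papers papers under f(i) = A * B(i, rho + 1).

    rho = 1 / (1 - alpha) with alpha = S / N.
    Assumption (not from the paper): A = S * rho, so that f(i) sums to S.
    """
    alpha = total_authors / total_papers
    rho = 1 / (1 - alpha)
    a = total_authors * rho
    counts = []
    for i in range(1, max_papers + 1):
        # log B(i, rho + 1) = lgamma(i) + lgamma(rho + 1) - lgamma(i + rho + 1)
        log_beta = math.lgamma(i) + math.lgamma(rho + 1) - math.lgamma(i + rho + 1)
        counts.append(a * math.exp(log_beta))
    return counts


# Drosophilists row of Table 3: S = 826 authors, N = 3,662 papers
counts = yule_counts(826, 3662, 20000)
print(f"authors with one paper: {counts[0]:.1f}")
print(f"authors summed: {sum(counts):.1f}")
```

For large i the terms fall off roughly as i to the power -(rho + 1), so the distribution has the same long tail that the inverse square law describes.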


The Weibull distribution, reference 5, has also been considered
for its applicability, equation (5). The Weibull Frequency
Distribution is:

$$ F(x) = 1 - e^{-\frac{(x-C)^{B}}{A}} \tag{5} $$

Where F(x) is Cumulative Distribution Function
x is Number of Papers
A is Scale Parameter
B is Shape Parameter
C is Location Parameter.
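
Because the Weibull form is a cumulative distribution, it has to be differenced to give a predicted number of authors at each paper count. The sketch below shows one way this might be done. It rests on several assumptions that are not in the paper: the cumulative form is taken as F(x) = 1 - exp(-(x - C)^B / A), the count for x papers is taken as S * (F(x) - F(x - 1)), and the parameter values are purely hypothetical placeholders, not fitted values.

```python
import math


def weibull_cdf(x, a, b, c):
    """F(x) = 1 - exp(-(x - c)**b / a) for x > c, and 0 otherwise."""
    if x <= c:
        return 0.0
    return 1.0 - math.exp(-((x - c) ** b) / a)


def weibull_counts(total_authors, a, b, c, max_papers):
    """Expected authors with exactly x papers, taken as S * (F(x) - F(x - 1)).

    Assumption: the continuous CDF is binned on the intervals (x - 1, x].
    """
    return [
        total_authors * (weibull_cdf(x, a, b, c) - weibull_cdf(x - 1, a, b, c))
        for x in range(1, max_papers + 1)
    ]


# Hypothetical parameter values for illustration only; they are not
# fitted values from the paper.
counts = weibull_counts(826, a=1.5, b=0.5, c=0.0, max_papers=50)
for x, n in enumerate(counts[:5], start=1):
    print(f"{x} papers: about {n:.0f} authors")
```

In practice A, B and C would be estimated from the observed distribution before any comparison with the other models.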

Predictions of frequency distributions using these
theoretical models have been made for a number of data bases for
different disciplines. For example, the distribution of
publications (reference 6) of biologists studying the genetics of
Drosophila during the period 1905-1938 has been compared with
various predictions as illustrated in Figures 3 through 7. Table
5 shows that the Weibull distribution would appear to be the best
fit of those tested. Further research is underway.

CONCLUSION 

Preliminary  examination  of  the  problem  of  forecasting  how 
many  papers  a chemist  or  a biologist  might  publish  in  his  lifetime 
has  indicated  wide  variations  in  productivity.  Many  chemists  may 
never  publish.  For  those  who  do  publish,  it  should  be  possible  to 
predict  a frequency  distribution.  Research  is  now  underway  exploring 
the  applicability  of  the  Weibull  distribution  for  this  purpose. 

TABLE 5

[Table not reproduced.]

FIG. 4: NUMBER OF AUTHORS PREDICTED BY WILLIAMS' DISTRIBUTION
[Figure not reproduced; axis label: Number of Authors.]

FIG. 7: NUMBER OF AUTHORS PREDICTED BY WEIBULL DISTRIBUTION
[Figure not reproduced; axis label: Number of Authors.]